Overview

Dataset statistics

Number of variables 13
Number of observations 18249
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 1.8 MiB
Average record size in memory 104.0 B
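
As a rough consistency check, the total size can be recovered from the number of observations and the average record size listed above. The sketch below is illustrative arithmetic only; the variable names are not part of the report.

```python
# Values copied from the dataset statistics above.
n_observations = 18249
avg_record_bytes = 104.0

# Total memory is the record count times the average record size.
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 1024 ** 2  # 1 MiB = 1024 * 1024 bytes

print(f"{total_bytes:.0f} B = {total_mib:.1f} MiB")
```

The 104 B figure is also what thirteen 8-byte values per row would give, although the report does not state column widths.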

Variable types

Categorical 4
Numeric 9

Alerts

Date has a high cardinality: 169 distinct values High cardinality
region has a high cardinality: 54 distinct values High cardinality
AveragePrice is highly correlated with Total Volume and 6 other fields High correlation
Total Volume is highly correlated with AveragePrice and 7 other fields High correlation
4046 is highly correlated with AveragePrice and 7 other fields High correlation
4225 is highly correlated with AveragePrice and 7 other fields High correlation
4770 is highly correlated with AveragePrice and 7 other fields High correlation
Total Bags is highly correlated with AveragePrice and 7 other fields High correlation
Small Bags is highly correlated with AveragePrice and 7 other fields High correlation
Large Bags is highly correlated with AveragePrice and 7 other fields High correlation
XLarge Bags is highly correlated with Total Volume and 6 other fields High correlation
Total Volume is highly correlated with 4046 and 6 other fields High correlation
4046 is highly correlated with Total Volume and 6 other fields High correlation
4225 is highly correlated with Total Volume and 6 other fields High correlation
4770 is highly correlated with Total Volume and 6 other fields High correlation
Total Bags is highly correlated with Total Volume and 6 other fields High correlation
Small Bags is highly correlated with Total Volume and 6 other fields High correlation
Large Bags is highly correlated with Total Volume and 6 other fields High correlation
XLarge Bags is highly correlated with Total Volume and 6 other fields High correlation
Total Volume is highly correlated with 4046 and 6 other fields High correlation
4046 is highly correlated with Total Volume and 4 other fields High correlation
4225 is highly correlated with Total Volume and 4 other fields High correlation
4770 is highly correlated with Total Volume and 5 other fields High correlation
Total Bags is highly correlated with Total Volume and 6 other fields High correlation
Small Bags is highly correlated with Total Volume and 5 other fields High correlation
Large Bags is highly correlated with Total Volume and 1 other field High correlation
XLarge Bags is highly correlated with Total Volume and 3 other fields High correlation
AveragePrice is highly correlated with type and 1 other field High correlation
Total Volume is highly correlated with 4046 and 7 other fields High correlation
4046 is highly correlated with Total Volume and 7 other fields High correlation
4225 is highly correlated with Total Volume and 7 other fields High correlation
4770 is highly correlated with Total Volume and 7 other fields High correlation
Total Bags is highly correlated with Total Volume and 7 other fields High correlation
Small Bags is highly correlated with Total Volume and 7 other fields High correlation
Large Bags is highly correlated with Total Volume and 7 other fields High correlation
XLarge Bags is highly correlated with Total Volume and 6 other fields High correlation
type is highly correlated with AveragePrice High correlation
region is highly correlated with AveragePrice and 7 other fields High correlation
Date is uniformly distributed Uniform
region is uniformly distributed Uniform
4046 has 242 (1.3%) zeros Zeros
4770 has 5497 (30.1%) zeros Zeros
Large Bags has 2370 (13.0%) zeros Zeros
XLarge Bags has 12048 (66.0%) zeros Zeros
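
The percentages in the Zeros alerts follow directly from the zero counts and the 18249 observations. A minimal sketch of that arithmetic, assuming the counts listed above (the dictionary names are illustrative, not part of the report):

```python
# Zero counts copied from the Zeros alerts above.
n_observations = 18249
zero_counts = {
    "4046": 242,
    "4770": 5497,
    "Large Bags": 2370,
    "XLarge Bags": 12048,
}

# Share of zeros per variable, as a percentage rounded to one decimal place.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```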

Reproduction

Analysis started 2022-05-09 16:45:52.119351
Analysis finished 2022-05-09 16:46:26.847769
Duration 34.73 seconds
Software version pandas-profiling v3.1.0
Download configuration config.json

Variables

Date
Categorical

HIGH CARDINALITY
UNIFORM

Distinct 169
Distinct (%) 0.9%
Missing 0
Missing (%) 0.0%
Memory size 142.7 KiB
2017-03-26 108
2016-07-03 108
2015-11-08 108
2016-10-23 108
2015-09-13 108
Other values (164) 17709

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 0
Distinct characters 0
Distinct categories 0
Distinct scripts 0
Distinct blocks 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2015-12-27
2nd row 2015-12-20
3rd row 2015-12-13
4th row 2015-12-06
5th row 2015-11-29

Common Values

Value Count Frequency (%)
2017-03-26 108 0.6%
2016-07-03 108 0.6%
2015-11-08 108 0.6%
2016-10-23 108 0.6%
2015-09-13 108 0.6%
2017-12-17 108 0.6%
2015-04-19 108 0.6%
2016-03-27 108 0.6%
2016-08-28 108 0.6%
2017-04-23 108 0.6%
Other values (159) 17169 94.1%

Length

Histogram of lengths of the category
Value Count Frequency (%)
2017-05-21 108 0.6%
2018-01-21 108 0.6%
2016-08-14 108 0.6%
2016-10-30 108 0.6%
2017-02-26 108 0.6%
2016-01-17 108 0.6%
2016-01-03 108 0.6%
2017-12-03 108 0.6%
2016-03-13 108 0.6%
2017-04-09 108 0.6%
Other values (159) 17169 94.1%

Most occurring characters

Value Count Frequency (%)
No values found.

Most occurring categories

Value Count Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value Count Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value Count Frequency (%)
No values found.

Most frequent character per block

AveragePrice
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct 259
Distinct (%) 1.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1.40597841
Minimum 0.44
Maximum 3.25
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0.44
5-th percentile 0.83
Q1 1.1
median 1.37
Q3 1.66
95-th percentile 2.11
Maximum 3.25
Range 2.81
Interquartile range (IQR) 0.56

Descriptive statistics

Standard deviation 0.4026765555
Coefficient of variation (CV) 0.2864030861
Kurtosis 0.3251958507
Mean 1.40597841
Median Absolute Deviation (MAD) 0.28
Skewness 0.5803027379
Sum 25657.7
Variance 0.1621484083
Monotonicity Not monotonic
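
Several of the statistics above are derived from one another, which gives a quick way to sanity-check the report. The sketch below recomputes the range, IQR, coefficient of variation, variance, and mean for AveragePrice from the reported figures; the variable names are illustrative, and small discrepancies are expected because the inputs are rounded.

```python
from math import isclose

# Figures copied from the AveragePrice statistics above.
n = 18249
mean, std = 1.40597841, 0.4026765555
minimum, q1, q3, maximum = 0.44, 1.1, 1.66, 3.25
total = 25657.7

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared
mean_from_sum = total / n         # Mean = Sum / number of observations

print(round(value_range, 2), round(iqr, 2), round(cv, 6))
print(round(variance, 6), round(mean_from_sum, 4))
```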
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
1.15 202 1.1%
1.18 199 1.1%
1.08 194 1.1%
1.26 193 1.1%
1.13 192 1.1%
0.98 189 1.0%
1.19 188 1.0%
1.36 187 1.0%
1.59 186 1.0%
0.99 185 1.0%
Other values (249) 16334 89.5%
Value Count Frequency (%)
0.44 1 < 0.1%
0.46 1 < 0.1%
0.48 1 < 0.1%
0.49 2 < 0.1%
0.51 5 < 0.1%
0.52 3 < 0.1%
0.53 6 < 0.1%
0.54 7 < 0.1%
0.55 3 < 0.1%
0.56 12 0.1%
Value Count Frequency (%)
3.25 1 < 0.1%
3.17 1 < 0.1%
3.12 1 < 0.1%
3.05 1 < 0.1%
3.04 1 < 0.1%
3.03 1 < 0.1%
3 2 < 0.1%
2.99 2 < 0.1%
2.97 1 < 0.1%
2.96 1 < 0.1%

Total Volume
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct 18237
Distinct (%) 99.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 850644.013
Minimum 84.56
Maximum 62505646.52
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 84.56
5-th percentile 2371.862
Q1 10838.58
median 107376.76
Q3 432962.29
95-th percentile 3716315.41
Maximum 62505646.52
Range 62505561.96
Interquartile range (IQR) 422123.71

Descriptive statistics

Standard deviation 3453545.355
Coefficient of variation (CV) 4.059918488
Kurtosis 92.10445778
Mean 850644.013
Median Absolute Deviation (MAD) 102962.47
Skewness 9.007687479
Sum 1.552340259 × 10^10
Variance 1.192697552 × 10^13
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
3713.49 2 < 0.1%
3529.44 2 < 0.1%
2038.99 2 < 0.1%
569349.05 2 < 0.1%
4103.97 2 < 0.1%
9465.99 2 < 0.1%
46602.16 2 < 0.1%
2858.31 2 < 0.1%
7223.46 2 < 0.1%
19634.24 2 < 0.1%
Other values (18227) 18229 99.9%
Value Count Frequency (%)
84.56 1 < 0.1%
379.82 1 < 0.1%
385.55 1 < 0.1%
419.98 1 < 0.1%
472.82 1 < 0.1%
482.26 1 < 0.1%
515.01 1 < 0.1%
530.96 1 < 0.1%
542.85 1 < 0.1%
561.1 1 < 0.1%
Value Count Frequency (%)
62505646.52 1 < 0.1%
61034457.1 1 < 0.1%
52288697.89 1 < 0.1%
47293921.6 1 < 0.1%
46324529.7 1 < 0.1%
44655461.51 1 < 0.1%
43409835.75 1 < 0.1%
43167806.09 1 < 0.1%
42939821.55 1 < 0.1%
42867608.54 1 < 0.1%

4046
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct 17702
Distinct (%) 97.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 293008.4245
Minimum 0
Maximum 22743616.17
Zeros 242
Zeros (%) 1.3%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 19.6
Q1 854.07
median 8645.3
Q3 111020.2
95-th percentile 1263359.678
Maximum 22743616.17
Range 22743616.17
Interquartile range (IQR) 110166.13

Descriptive statistics

Standard deviation 1264989.082
Coefficient of variation (CV) 4.317244747
Kurtosis 86.80911256
Mean 293008.4245
Median Absolute Deviation (MAD) 8616.69
Skewness 8.648219757
Sum 5347110739
Variance 1.600197377 × 10^12
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 242 1.3%
3 10 0.1%
1.24 8 < 0.1%
1 8 < 0.1%
4 8 < 0.1%
1.25 7 < 0.1%
6 7 < 0.1%
1.21 6 < 0.1%
2.54 5 < 0.1%
1.27 5 < 0.1%
Other values (17692) 17943 98.3%
Value Count Frequency (%)
0 242 1.3%
1 8 < 0.1%
1.13 1 < 0.1%
1.19 3 < 0.1%
1.2 1 < 0.1%
1.21 6 < 0.1%
1.22 5 < 0.1%
1.23 1 < 0.1%
1.24 8 < 0.1%
1.25 7 < 0.1%
Value Count Frequency (%)
22743616.17 1 < 0.1%
21620180.9 1 < 0.1%
18933038.04 1 < 0.1%
17787611.93 1 < 0.1%
17076650.82 1 < 0.1%
16573573.78 1 < 0.1%
16529797.6 1 < 0.1%
16383685.07 1 < 0.1%
16215328.75 1 < 0.1%
16000107.8 1 < 0.1%

4225
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct 18103
Distinct (%) 99.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 295154.5684
Minimum 0
Maximum 20470572.61
Zeros 61
Zeros (%) 0.3%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 103.614
Q1 3008.78
median 29061.02
Q3 150206.86
95-th percentile 1303657.658
Maximum 20470572.61
Range 20470572.61
Interquartile range (IQR) 147198.08

Descriptive statistics

Standard deviation 1204120.401
Coefficient of variation (CV) 4.079626508
Kurtosis 91.94902197
Mean 295154.5684
Median Absolute Deviation (MAD) 28521.3
Skewness 8.942465608
Sum 5386275718
Variance 1.44990594 × 10^12
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 61 0.3%
215.36 3 < 0.1%
177.87 3 < 0.1%
1.3 3 < 0.1%
94.74 3 < 0.1%
1.26 3 < 0.1%
3478.97 2 < 0.1%
61.01 2 < 0.1%
65.22 2 < 0.1%
5.73 2 < 0.1%
Other values (18093) 18165 99.5%
Value Count Frequency (%)
0 61 0.3%
1.26 3 < 0.1%
1.28 2 < 0.1%
1.3 3 < 0.1%
1.31 1 < 0.1%
1.32 2 < 0.1%
1.64 1 < 0.1%
2.39 1 < 0.1%
2.4 1 < 0.1%
2.48 1 < 0.1%
Value Count Frequency (%)
20470572.61 1 < 0.1%
20445501.03 1 < 0.1%
20328161.55 1 < 0.1%
18956479.74 1 < 0.1%
17896391.6 1 < 0.1%
16602589.04 1 < 0.1%
16054083.86 1 < 0.1%
15899858.37 1 < 0.1%
14888077.69 1 < 0.1%
14437190.03 1 < 0.1%

4770
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct 12071
Distinct (%) 66.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 22839.73599
Minimum 0
Maximum 2546439.11
Zeros 5497
Zeros (%) 30.1%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 184.99
Q3 6243.42
95-th percentile 106156.574
Maximum 2546439.11
Range 2546439.11
Interquartile range (IQR) 6243.42

Descriptive statistics

Standard deviation 107464.0684
Coefficient of variation (CV) 4.705136192
Kurtosis 132.5634409
Mean 22839.73599
Median Absolute Deviation (MAD) 184.99
Skewness 10.15939563
Sum 416802342.1
Variance 1.1548526 × 10^10
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 5497 30.1%
2.66 7 < 0.1%
3.32 7 < 0.1%
1.64 6 < 0.1%
10.97 6 < 0.1%
1.6 6 < 0.1%
1.59 6 < 0.1%
2.74 5 < 0.1%
1.65 5 < 0.1%
1.63 5 < 0.1%
Other values (12061) 12699 69.6%
Value Count Frequency (%)
0 5497 30.1%
0.83 1 < 0.1%
1 3 < 0.1%
1.01 1 < 0.1%
1.09 1 < 0.1%
1.11 1 < 0.1%
1.12 1 < 0.1%
1.15 1 < 0.1%
1.16 1 < 0.1%
1.18 5 < 0.1%
Value Count Frequency (%)
2546439.11 1 < 0.1%
1993645.36 1 < 0.1%
1896149.5 1 < 0.1%
1880231.38 1 < 0.1%
1811090.71 1 < 0.1%
1800065.57 1 < 0.1%
1773088.87 1 < 0.1%
1770948.09 1 < 0.1%
1761343.08 1 < 0.1%
1753852.61 1 < 0.1%

Total Bags
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct 18097
Distinct (%) 99.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 239639.2021
Minimum 0
Maximum 19373134.37
Zeros 15
Zeros (%) 0.1%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 628.89
Q1 5088.64
median 39743.83
Q3 110783.37
95-th percentile 1005478.892
Maximum 19373134.37
Range 19373134.37
Interquartile range (IQR) 105694.73

Descriptive statistics

Standard deviation 986242.3992
Coefficient of variation (CV) 4.115530309
Kurtosis 112.2721565
Mean 239639.2021
Median Absolute Deviation (MAD) 37299.96
Skewness 9.75607167
Sum 4373175798
Variance 9.7267407 × 10^11
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 15 0.1%
300 5 < 0.1%
990 5 < 0.1%
916.67 4 < 0.1%
266.67 4 < 0.1%
550 4 < 0.1%
856.67 3 < 0.1%
153.33 3 < 0.1%
196.67 3 < 0.1%
803.33 3 < 0.1%
Other values (18087) 18200 99.7%
Value Count Frequency (%)
0 15 0.1%
3.09 1 < 0.1%
3.11 1 < 0.1%
3.19 1 < 0.1%
3.33 1 < 0.1%
6.14 1 < 0.1%
6.18 1 < 0.1%
6.24 1 < 0.1%
6.36 1 < 0.1%
7.02 1 < 0.1%
Value Count Frequency (%)
19373134.37 1 < 0.1%
16394524.11 1 < 0.1%
16298296.29 1 < 0.1%
15972492.07 1 < 0.1%
15804696.31 1 < 0.1%
15102426.94 1 < 0.1%
15051877.14 1 < 0.1%
14894893.8 1 < 0.1%
14504209.37 1 < 0.1%
14440611.5 1 < 0.1%

Small Bags
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct 17321
Distinct (%) 94.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 182194.6867
Minimum 0
Maximum 13384586.8
Zeros 159
Zeros (%) 0.9%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 256.67
Q1 2849.42
median 26362.82
Q3 83337.67
95-th percentile 768147.228
Maximum 13384586.8
Range 13384586.8
Interquartile range (IQR) 80488.25

Descriptive statistics

Standard deviation 746178.515
Coefficient of variation (CV) 4.095500964
Kurtosis 107.0128851
Mean 182194.6867
Median Absolute Deviation (MAD) 25599.49
Skewness 9.540659982
Sum 3324870838
Variance 5.567823762 × 10^11
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 159 0.9%
203.33 11 0.1%
533.33 10 0.1%
223.33 10 0.1%
103.33 8 < 0.1%
326.67 8 < 0.1%
300 8 < 0.1%
196.67 8 < 0.1%
263.33 8 < 0.1%
123.33 8 < 0.1%
Other values (17311) 18011 98.7%
Value Count Frequency (%)
0 159 0.9%
2.52 1 < 0.1%
2.57 1 < 0.1%
2.73 1 < 0.1%
2.79 1 < 0.1%
2.95 3 < 0.1%
2.96 1 < 0.1%
3.06 1 < 0.1%
3.09 1 < 0.1%
3.11 1 < 0.1%
Value Count Frequency (%)
13384586.8 1 < 0.1%
12567155.58 1 < 0.1%
12540327.19 1 < 0.1%
11712807.19 1 < 0.1%
11392828.89 1 < 0.1%
11228049.63 1 < 0.1%
11112405.61 1 < 0.1%
10844852.22 1 < 0.1%
10832907.44 1 < 0.1%
10666942.78 1 < 0.1%

Large Bags
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct 15082
Distinct (%) 82.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 54338.08814
Minimum 0
Maximum 5719096.61
Zeros 2370
Zeros (%) 13.0%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 127.47
median 2647.71
Q3 22029.25
95-th percentile 195699.768
Maximum 5719096.61
Range 5719096.61
Interquartile range (IQR) 21901.78

Descriptive statistics

Standard deviation 243965.9645
Coefficient of variation (CV) 4.489778218
Kurtosis 117.999481
Mean 54338.08814
Median Absolute Deviation (MAD) 2647.71
Skewness 9.796454599
Sum 991615770.5
Variance 5.951939186 × 10^10
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 2370 13.0%
3.33 187 1.0%
6.67 78 0.4%
10 47 0.3%
4.44 38 0.2%
13.33 28 0.2%
16.67 18 0.1%
6.66 18 0.1%
26.67 18 0.1%
20 14 0.1%
Other values (15072) 15433 84.6%
Value Count Frequency (%)
0 2370 13.0%
0.97 1 < 0.1%
1.3 1 < 0.1%
1.33 1 < 0.1%
1.38 2 < 0.1%
1.44 1 < 0.1%
1.48 1 < 0.1%
1.55 1 < 0.1%
1.56 1 < 0.1%
1.62 1 < 0.1%
Value Count Frequency (%)
5719096.61 1 < 0.1%
4324231.19 1 < 0.1%
4081397.72 1 < 0.1%
4023485.04 1 < 0.1%
3988101.74 1 < 0.1%
3917569.95 1 < 0.1%
3789722.9 1 < 0.1%
3618270.75 1 < 0.1%
3544729.39 1 < 0.1%
3434846.78 1 < 0.1%

XLarge Bags
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct 5588
Distinct (%) 30.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 3106.426507
Minimum 0
Maximum 551693.65
Zeros 12048
Zeros (%) 66.0%
Negative 0
Negative (%) 0.0%
Memory size 142.7 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 132.5
95-th percentile 12058.452
Maximum 551693.65
Range 551693.65
Interquartile range (IQR) 132.5

Descriptive statistics

Standard deviation 17692.89465
Coefficient of variation (CV) 5.695578058
Kurtosis 233.6026119
Mean 3106.426507
Median Absolute Deviation (MAD) 0
Skewness 13.13975069
Sum 56689177.33
Variance 313038521.2
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0 12048 66.0%
3.33 29 0.2%
6.67 16 0.1%
1.11 15 0.1%
5 12 0.1%
10 9 < 0.1%
16.67 8 < 0.1%
2.22 7 < 0.1%
150 6 < 0.1%
80 6 < 0.1%
Other values (5578) 6093 33.4%
Value Count Frequency (%)
0 12048 66.0%
1 1 < 0.1%
1.11 15 0.1%
1.26 1 < 0.1%
1.3 1 < 0.1%
1.38 1 < 0.1%
1.41 2 < 0.1%
1.45 1 < 0.1%
1.47 4 < 0.1%
1.49 2 < 0.1%
Value Count Frequency (%)
551693.65 1 < 0.1%
454343.65 1 < 0.1%
390478.73 1 < 0.1%
387400.22 1 < 0.1%
377661.06 1 < 0.1%
373523.47 1 < 0.1%
347390.14 1 < 0.1%
328589.09 1 < 0.1%
326348.15 1 < 0.1%
321033.23 1 < 0.1%

type
Categorical

HIGH CORRELATION

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 142.7 KiB
conventional 9126
organic 9123

Length

Max length 12
Median length 12
Mean length 9.500410981
Min length 7

Characters and Unicode

Total characters 0
Distinct characters 0
Distinct categories 0
Distinct scripts 0
Distinct blocks 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row conventional
2nd row conventional
3rd row conventional
4th row conventional
5th row conventional

Common Values

Value Count Frequency (%)
conventional 9126 50.0%
organic 9123 50.0%

Length

Histogram of lengths of the category

Pie chart

Value Count Frequency (%)
conventional 9126 50.0%
organic 9123 50.0%

Most occurring characters

Value Count Frequency (%)
No values found.

Most occurring categories

Value Count Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value Count Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value Count Frequency (%)
No values found.

Most frequent character per block

year
Categorical

Distinct 4
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 142.7 KiB
2017 5722
2016 5616
2015 5615
2018 1296

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 0
Distinct characters 0
Distinct categories 0
Distinct scripts 0
Distinct blocks 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2015
2nd row 2015
3rd row 2015
4th row 2015
5th row 2015

Common Values

Value Count Frequency (%)
2017 5722 31.4%
2016 5616 30.8%
2015 5615 30.8%
2018 1296 7.1%

Length

Histogram of lengths of the category

Pie chart

Value Count Frequency (%)
2017 5722 31.4%
2016 5616 30.8%
2015 5615 30.8%
2018 1296 7.1%

Most occurring characters

Value Count Frequency (%)
No values found.

Most occurring categories

Value Count Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value Count Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value Count Frequency (%)
No values found.

Most frequent character per block

region
Categorical

HIGH CARDINALITY
HIGH CORRELATION
UNIFORM

Distinct 54
Distinct (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory size 142.7 KiB
BaltimoreWashington 338
SanFrancisco 338
MiamiFtLauderdale 338
HarrisburgScranton 338
TotalUS 338
Other values (49) 16559

Length

Max length 19
Median length 9
Mean length 10.29535865
Min length 4

Characters and Unicode

Total characters 0
Distinct characters 0
Distinct categories 0
Distinct scripts 0
Distinct blocks 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Albany
2nd row Albany
3rd row Albany
4th row Albany
5th row Albany

Common Values

Value Count Frequency (%)
BaltimoreWashington 338 1.9%
SanFrancisco 338 1.9%
MiamiFtLauderdale 338 1.9%
HarrisburgScranton 338 1.9%
TotalUS 338 1.9%
Southeast 338 1.9%
Louisville 338 1.9%
Pittsburgh 338 1.9%
Albany 338 1.9%
StLouis 338 1.9%
Other values (44) 14869 81.5%

Length

Histogram of lengths of the category
Value Count Frequency (%)
southcarolina 338 1.9%
detroit 338 1.9%
midsouth 338 1.9%
syracuse 338 1.9%
jacksonville 338 1.9%
roanoke 338 1.9%
newyork 338 1.9%
dallasftworth 338 1.9%
houston 338 1.9%
atlanta 338 1.9%
Other values (44) 14869 81.5%

Most occurring characters

Value Count Frequency (%)
No values found.

Most occurring categories

Value Count Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value Count Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value Count Frequency (%)
No values found.

Most frequent character per block

Interactions

[81 interaction plots were embedded here as SVG images; the images are not preserved in this text.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
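
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself uses; the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are assumptions, and ties receive the average of their ranks.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: rho is 1, while r is below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```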

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
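
The formula and the invariance property can be checked with a short sketch. This is an illustrative implementation under simple assumptions (equal-length numeric lists, non-constant inputs), not the routine used to produce this report; the function name `pearson` and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of covariance and variances cancel out.
    return sxy / sqrt(sxx * syy)


# Toy data (hypothetical), roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# Negating one variable flips the sign of r.
r_flipped = pearson([-a for a in x], y)

print(r, r_rescaled, r_flipped)
```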

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
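
The pair-counting definition above can be sketched directly. This is a naive O(n²) illustration of the formula as stated (often called τ-a), in which tied pairs count as neither concordant nor discordant; it is an assumption that this simple variant is adequate for the example, and statistical libraries commonly use tie-adjusted variants instead. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring ties."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Toy data (hypothetical): two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

For the toy input there are three pairs in total, two concordant and one discordant, so the result is (2 - 1) / 3.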

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
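
For orientation, here is a minimal sketch of a bias-corrected Cramér's V computed from a contingency table of counts, following the correction as it is commonly stated for Bergsma's estimator. It is an assumption that this matches the report's own implementation in every detail; the function name `cramers_v_corrected` and the toy tables are hypothetical.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy tables (hypothetical): independent counts and perfectly associated counts.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```

The first table has identical rows, so the statistic is 0; the second places every observation on the diagonal, so the corrected statistic is at (or within rounding of) 1.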

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

         Date  AveragePrice  Total Volume     4046       4225    4770  Total Bags  Small Bags  Large Bags  XLarge Bags          type  year  region
0  2015-12-27          1.33      64236.62  1036.74   54454.85   48.16     8696.87     8603.62       93.25          0.0  conventional  2015  Albany
1  2015-12-20          1.35      54876.98   674.28   44638.81   58.33     9505.56     9408.07       97.49          0.0  conventional  2015  Albany
2  2015-12-13          0.93     118220.22   794.70  109149.67  130.50     8145.35     8042.21      103.14          0.0  conventional  2015  Albany
3  2015-12-06          1.08      78992.15  1132.00   71976.41   72.58     5811.16     5677.40      133.76          0.0  conventional  2015  Albany
4  2015-11-29          1.28      51039.60   941.48   43838.39   75.78     6183.95     5986.26      197.69          0.0  conventional  2015  Albany
5  2015-11-22          1.26      55979.78  1184.27   48067.99   43.61     6683.91     6556.47      127.44          0.0  conventional  2015  Albany
6  2015-11-15          0.99      83453.76  1368.92   73672.72   93.26     8318.86     8196.81      122.05          0.0  conventional  2015  Albany
7  2015-11-08          0.98     109428.33   703.75  101815.36   80.00     6829.22     6266.85      562.37          0.0  conventional  2015  Albany
8  2015-11-01          1.02      99811.42  1022.15   87315.57   85.34    11388.36    11104.53      283.83          0.0  conventional  2015  Albany
9  2015-10-25          1.07      74338.76   842.40   64757.44  113.00     8625.92     8061.47      564.45          0.0  conventional  2015  Albany

Last rows

             Date  AveragePrice  Total Volume     4046     4225    4770  Total Bags  Small Bags  Large Bags  XLarge Bags     type  year  region
18239  2018-03-11          1.56      22128.42  2162.67  3194.25    8.93    16762.57    16510.32      252.25          0.0  organic  2018  WestTexNewMexico
18240  2018-03-04          1.54      17393.30  1832.24  1905.57    0.00    13655.49    13401.93      253.56          0.0  organic  2018  WestTexNewMexico
18241  2018-02-25          1.57      18421.24  1974.26  2482.65    0.00    13964.33    13698.27      266.06          0.0  organic  2018  WestTexNewMexico
18242  2018-02-18          1.56      17597.12  1892.05  1928.36    0.00    13776.71    13553.53      223.18          0.0  organic  2018  WestTexNewMexico
18243  2018-02-11          1.57      15986.17  1924.28  1368.32    0.00    12693.57    12437.35      256.22          0.0  organic  2018  WestTexNewMexico
18244  2018-02-04          1.63      17074.83  2046.96  1529.20    0.00    13498.67    13066.82      431.85          0.0  organic  2018  WestTexNewMexico
18245  2018-01-28          1.71      13888.04  1191.70  3431.50    0.00     9264.84     8940.04      324.80          0.0  organic  2018  WestTexNewMexico
18246  2018-01-21          1.87      13766.76  1191.92  2452.79  727.94     9394.11     9351.80       42.31          0.0  organic  2018  WestTexNewMexico
18247  2018-01-14          1.93      16205.22  1527.63  2981.04  727.01    10969.54    10919.54       50.00          0.0  organic  2018  WestTexNewMexico
18248  2018-01-07          1.62      17489.58  2894.77  2356.13  224.53    12014.15    11988.14       26.01          0.0  organic  2018  WestTexNewMexico